Associating telemetry data from a group of entities

ABSTRACT

Embodiments of the invention provide an ability to associate telemetry data received from different entities, such as guest and/or host machines residing on one or more particular physical computers (e.g., server computers) executing virtualization software. In some embodiments, telemetry data supplied by each entity includes information that identifies, and preserves the anonymity of, the entity (e.g., the computer(s) on which the guest and/or host machine(s) reside(s)). For example, if the entities comprise guest and/or host machines residing on a single computer, the information may comprise a one-way hash of the fully qualified domain name (FQDN) of the computer. If the entities are guest and/or host machines residing on a group of computers, the information may comprise a one-way hash of a portion of an FQDN for each computer which is common to all computers in the group. If the group of computers belong to a network domain having a globally unique identifier (GUID) (e.g., as employed by Microsoft Active Directory), the information may comprise a one-way hash of a portion of the GUID.

FIELD OF INVENTION

This invention relates to the automated collection of information stored on computers deployed in a client-server environment.

BACKGROUND OF INVENTION

Many organizations operate one or more server computers to perform various computing tasks. Each server may communicate with one or more client computers over a network. For example, some server computers communicate with client computers within an organization over a local network, and some server computers communicate with client computers over the Internet. Generally, a server computer executes at least one operating system, and one or more server applications may execute under the control of each operating system. A server application may, for example, carry out tasks on behalf of, or provide services to, applications running on client computers. One common example of a server application is a web server application, which processes requests for information received from browser applications running on client computers and provides information to the browser application responsive to the requests.

In general, the services that a server is equipped to provide to a client computer are defined by modules or components of server applications installed on the server computer. Overall, these services may be thought of as “roles” which the server is capable of performing. A server may be equipped to perform a wide variety of roles. For example, depending on the application modules installed, a server may function as a file server, print server, mail server, web application server, terminal server, remote access and/or virtual private network (VPN) server, directory services server, streaming media server, or other server role. A server may perform any number of roles at a given time.

In accordance with some conventional techniques, information relating to how server application modules are installed and used on a server computer is collected. With some of these techniques, information (referred to herein as “telemetry information” or “telemetry data”) is collected from the server computer and/or the applications thereon, stored on the server computer, and uploaded to an information collection facility (e.g., with the consent of the party that maintains the server computer). Once uploaded, telemetry information may be analyzed to enable, for example, server application providers to refine their applications to make them more useful and less error-prone over time. For example, if an application provider determines based on uploaded information that certain server roles are commonly implemented in combination, the provider may modify the application to allow the roles in question to be more easily combined, or develop new features for one role that complement features in another. In another example, a provider may help customers avoid problems associated with particular implementations. For example, if a provider determines that a particular role is commonly implemented in a server which is ill-configured to support its features (e.g., on a server with access to insufficient network bandwidth), then the provider may suggest that customers avoid this configuration.

Conventional telemetry data typically includes information that enables an information collection facility to associate telemetry data coming from the same originating entity over time. For example, a server computer may include a particular identifier within each set of telemetry data sent to an information collection facility, so that the information collection facility may use the identifier to determine that different sets of telemetry data received over time originated from the same server computer. This identifier is typically constructed to obfuscate the identity of the server computer, so as to preserve the anonymity of the party that maintains and operates it, and to preserve that party's privacy with respect to the implementation and use of server hardware and software.

Virtualization is a technique whereby a computer's resources may be partitioned into separate and isolated “virtual machines,” each simulating a different machine within the same physical computer. Virtualization enables multiple instances of the same, or different, operating systems to run on the same physical computer and prevents applications running under the control of each operating system from interfering with each other's operation. In a system that employs virtualization, a virtual machine (also called a “guest machine”) includes an instance of an operating system (a “guest operating system”), under the control of which one or more applications execute within the virtual machine. Each guest operating system may make requests to employ the computer's hardware to either a “host” operating system (e.g., if each guest machine on the computer runs the same operating system), or a “virtual machine monitor,” or VMM (e.g., if the capability to run multiple operating systems is provided). In some conventional systems, guest machines are configured to provide telemetry data to an information collection facility.

SUMMARY OF INVENTION

Applicants have appreciated that conventional systems are incapable of associating telemetry data received from different entities (e.g., different guest and/or host machines residing on a particular physical computer executing virtualization software, or different physical computers), and that an ability to associate telemetry data originating from different entities may provide valuable insight into how these entities are configured and operate.

For example, an ability to associate telemetry data received from different guest and/or host machines implemented on the same physical computer may provide the ability to compare the manner in which the different guest and/or host machines, or the applications executing under their respective control, are implemented and used. For example, the roles implemented by applications executing under the control of different guest operating systems on a particular server computer, or the speed with which system operations are performed by one or more guest operating systems and the host operating system on a physical computer, may be compared and analyzed. Other types of analysis that would enable application developers to refine their products over time may additionally be performed.

In accordance with some embodiments, telemetry data supplied by each entity includes information that identifies, and yet preserves the anonymity of, the entity and/or the party that operates it. For example, if the entity is a guest or host machine residing on a physical computer, the information may preserve the anonymity of the guest or host machine, the computer on which the guest or host machine resides, and the party that operates the computer. For example, in some embodiments, each guest and/or host machine residing on a physical computer stores the fully qualified domain name (FQDN) of the computer, generates a one-way hash of the FQDN (e.g., using the SHA-256 one-way hashing algorithm), and incorporates the hash into telemetry data that is uploaded to an information collection facility. Using a one-way hashing algorithm may, for example, prevent an identification of the computer or its operator from the hash. The information collection facility, upon receiving the telemetry data, may compare hashes received from various guest and host machines. If the hashes provided by at least two guest and/or host machines match, the information collection facility may determine that the guest and/or host machines reside on the same physical server computer.

Embodiments of the invention are not limited to a virtualized environment, as some embodiments provide the ability to associate telemetry data from different computers in the same group (e.g., a network domain). For example, in some embodiments, telemetry data created by each computer in a group includes information that identifies the group, but does not compromise the anonymity of any computer, the group or the operator thereof. For example, in some embodiments, each computer may include within telemetry data a portion of an FQDN that is shared by each computer in the group, or, if the computers in the group employ Microsoft Active Directory, then the Active Directory GUID shared by all of the computers in the domain may be used. To preserve the anonymity of each computer and the group, a one-way hash of the information may, for example, be generated and included within telemetry data uploaded by each computer to an information collection facility. The information collection facility may use this information to determine that the computers reside in the same group.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a block diagram depicting a system in which a computer having a plurality of virtual machines residing thereon communicates with an information collection facility;

FIG. 2 is a flowchart depicting one example of a process whereby a guest and/or host machine may provide information to an information collection facility, in accordance with some embodiments of the invention;

FIG. 3 is a block diagram depicting a system in which different entities may provide information to an information collection facility, in accordance with some embodiments of the invention;

FIG. 4 is a block diagram depicting an example computer on which embodiments of the invention may be implemented; and

FIG. 5 is a block diagram depicting an example memory on which instructions may be encoded which, when executed, may implement embodiments of the invention.

DETAILED DESCRIPTION

In accordance with some embodiments of the invention, a capability is provided to associate telemetry data received from different entities, such as different guest and/or host machines residing on a particular physical computer (e.g., a server computer). For example, in some embodiments, telemetry data supplied to an information collection facility by each guest and/or host machine residing on a particular computer includes information that identifies, but preserves the anonymity of, the computer. Any suitable information may be used for this purpose, as embodiments of the invention are not limited to any particular implementation. In some embodiments, each guest and/or host machine on the computer stores the FQDN of the computer, generates a one-way hash of the FQDN (e.g., using the SHA-256 one-way hashing algorithm, and/or any other suitable algorithm), and incorporates the resulting hash into telemetry data that is uploaded to an information collection facility. The information collection facility may use the hashes to associate the guest and/or host machines with each other and/or the computer. Of course, embodiments of the invention are not limited to employing a computer's FQDN, as any information which may be used to associate guest and/or host machines may be used, including information which has no relationship to the computer. In addition, embodiments of the invention are not limited to employing a one-way hashing algorithm to generate the information included within the telemetry data, as any suitable technique may be employed to preserve the anonymity of the computer or its operator, if in fact anonymity is desired at all.

Embodiments of the invention are also not limited to associating telemetry data received from entities residing on a single computer, as some embodiments provide the ability to associate data received from different computers, such as computers that belong to the same group (e.g., a network domain). In some embodiments, the telemetry data that each computer uploads to an information collection facility includes information that identifies the group, but does not compromise its anonymity or that of the operator of any computer(s) in the group. For example, in some embodiments, each guest and/or host machine may store a portion of an identifier for computers in the group which is shared by all members of the group. For example, if a portion of the FQDN for each computer in the group is shared by all members, then that portion may be employed. Of course, embodiments of the invention are not limited to being implemented in this manner, as any suitable identifier(s) may be employed. For example, if all computers in the group reside in a network domain which employs Microsoft Active Directory, produced by Microsoft Corp. of Redmond, Wash., then the Active Directory globally unique identifier (GUID), which is unique to the domain and shared by its computers, may be employed. Any one or more identifiers may be employed, as the invention is not limited in this respect.

FIG. 1 depicts an example conventional system comprising a physical computer 100 on which guest machines 105-1, 105-2 and 105-3 reside. Computer 100 communicates with information collection facility 150 via network(s) 135, which may include any suitable communications infrastructure and employ any suitable protocol(s). Although FIG. 1 depicts only a single link between computer 100 and network 135, it should be appreciated that any suitable number of physical and/or virtual links (e.g., network adapters) may be employed (e.g., one for each guest machine), as the invention is not limited in this respect.

Each guest machine 105 includes a corresponding guest operating system 115, so that guest machine 105-1 includes operating system 115-1, guest machine 105-2 includes operating system 115-2, and guest machine 105-3 includes operating system 115-3. Each guest machine also includes an application 110 running under the control of its guest operating system, so that guest machine 105-1 includes application 110-1, guest machine 105-2 includes application 110-2 and guest machine 105-3 includes application 110-3. Although only one application 110 is shown as running under the control of a respective guest operating system, it should be appreciated that any number of applications may execute in a guest machine (including zero applications). Computer 100 also includes host operating system 120 which, in the example system shown, coordinates access by each of guest machines 105 to hardware 125 on the computer.

In the system of FIG. 1, host operating system 120 and each of guest machines 105 (e.g., guest operating systems 115 and applications 110) collect telemetry data regarding system operation. Some example techniques for collecting telemetry data are described in co-pending U.S. patent application Ser. No. 11/253,256, which is incorporated herein by reference. In general, each of the host operating system and guest machines may store telemetry data regarding the operation of the computer, the operating system, and/or the application(s) executing under its respective control. The data may be stored, for example, within the registry maintained by each of the host operating system and the guest machines, and/or in one or more other locations. One or more collection components may collect the stored information (e.g., on a periodic basis) and employ a transfer mechanism, such as the Software Quality Management (SQM) transfer mechanism offered by Microsoft Corp., or any other transfer mechanism, to transmit the information to information collection facility 150. As computer 100 is virtualized, each of host operating system 120 and guest operating systems 115-1, 115-2 and 115-3 provides telemetry data separately to information collection facility 150, and the telemetry data provided by each includes no information usable to identify other guest machines or the computer. As such, information collection facility 150 is unable to associate the telemetry data provided by any of the host operating system or guest machines with data provided from the others, or the computer.

Embodiments of the invention provide an ability to associate telemetry data provided by different guest and/or host machines residing on a physical computer by including within the data from each guest and/or host machine information which is usable to make this association. As noted above, the information may, for example, identify the computer on which each guest and/or host machine resides, in a manner which preserves the anonymity of the computer and its operator. As a result, telemetry data collected by guest and/or host machines may be more intelligently analyzed without compromising the privacy of the computer or operator.

FIG. 2 depicts an example process 200 whereby telemetry data that includes information usable to associate guest and/or host machines may be provided to an information collection facility. The example process shown in FIG. 2 depicts acts performed by a host machine (e.g., host operating system 120, FIG. 1), a guest machine (e.g., a guest operating system 115 in one of guest machines 105) and an information collection facility (e.g., facility 150). Although the example process of FIG. 2 includes acts performed by only one host machine and one guest machine, it should be appreciated that embodiments of the invention are not so limited, and that any suitable number of host and/or guest machines may perform the acts described below.

At the start of process 200, in act 205, the host machine reads one or more items of uniquely identifying information from the physical computer on which it resides. For example, the host machine may read the FQDN of the computer from any storage location in which the FQDN is stored by the physical computer. In act 220, the host machine generates a one-way hash of the information read from the computer in act 205. This may be performed in any of numerous ways, such as by executing programmed instructions designed for this purpose. In some embodiments, a SHA-256 one-way hashing algorithm may be employed to generate a hash of the computer's FQDN. If employed, the SHA-256 algorithm generates a 256-bit hash value of the information, which may be represented as an unsigned char array of length 32. This array may be converted to a string of 64 wide characters, wherein each UCHAR in the array is represented as two hexadecimal characters.
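By way of illustration only, acts 205 and 220 may be sketched as follows. The function name, the lower-casing of the FQDN, and the UTF-8 encoding applied before hashing are assumptions made for this sketch; the description above does not specify how the FQDN is normalized or encoded.

```python
import hashlib


def anonymized_computer_id(fqdn: str) -> str:
    """Return a 64-character hexadecimal SHA-256 hash of a computer's FQDN.

    Hypothetical helper: the normalization and encoding steps are assumptions.
    """
    # Assumption: normalize case so that host and guest machines produce the
    # same hash even if the FQDN is stored with different capitalization.
    normalized = fqdn.strip().lower()
    # SHA-256 yields a 256-bit (32-byte) digest.
    digest = hashlib.sha256(normalized.encode("utf-8")).digest()
    # Each byte is rendered as two hexadecimal characters: 32 bytes -> 64 chars.
    return digest.hex()


host_hash = anonymized_computer_id("host1.example.com")
guest_hash = anonymized_computer_id("HOST1.example.com")
```

In this sketch, a host machine and a guest machine that hold the same FQDN arrive at the same string, while the FQDN itself cannot practically be recovered from the hash.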

It should be appreciated that generating a one-way hash of the information read in act 205 is but one example of a technique for producing a derivation of the information that may be useful in preserving the anonymity of the physical computer and/or its operator. Any suitable one or more techniques may alternatively be employed. If a one-way hashing algorithm is employed, the SHA-256 algorithm need not be used, as any of numerous other one-way hashing algorithms may be employed. Of course, if preserving the anonymity of the computer and/or its operator is unimportant, then no derivation need be produced in act 220. Embodiments of the invention may be implemented in any of numerous ways, and are not limited in this respect.

In act 225, the one-way hash (e.g., the 64-character string) generated in act 220 is included in telemetry data provided to the information collection facility. This may be performed in any of numerous ways. For example, the host machine and guest machine may each write the one-way hash to a data point in a data stream which includes various other telemetry data points provided to the information collection facility (e.g., on a periodic basis).
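As a minimal sketch of act 225, the hash may be written as one data point alongside other telemetry data points. The record layout, the field name computer_hash, the sample data points, and the JSON serialization are all hypothetical; the description above does not define a data stream format.

```python
import hashlib
import json


def build_telemetry_record(fqdn: str, data_points: dict) -> dict:
    """Attach a one-way hash of the FQDN to a set of telemetry data points.

    The field name "computer_hash" is an assumption made for illustration.
    """
    record = dict(data_points)  # copy so the caller's dict is not modified
    record["computer_hash"] = hashlib.sha256(fqdn.encode("utf-8")).hexdigest()
    return record


record = build_telemetry_record(
    "host1.example.com",
    {"role": "web_server", "uptime_hours": 12},  # hypothetical data points
)
payload = json.dumps(record, sort_keys=True)
```

Only the hash, and not the FQDN, appears in the serialized payload.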

In act 230, the guest machine reads uniquely identifying information from the physical computer. As with act 205, this may be performed in any of numerous ways. For example, when virtualization software is initiated on the physical computer and the guest machine is created, the information may be copied (e.g., by virtualization software) to the guest machine, and stored in any one or more locations accessible by the guest machine. As noted above, the information may comprise, as an example, the FQDN of the computer.

In act 235, the guest machine generates a one-way hash of the information. As discussed above with reference to act 220, a SHA-256 one-way hashing algorithm may be employed, although the invention is not limited to such an implementation, as any one or more algorithms may alternatively be used.

In act 240, the one-way hash generated in act 235 is included in telemetry data provided by the guest machine to the information collection facility. For example, the guest machine may write the one-way hash to a data point in a data stream that includes other telemetry data uploaded to the information collection facility.

In acts 245 and 250, respectively, the information collection facility receives the data (including the information included in acts 225 and 240) from the host machine and guest machine. In act 255, the information collection facility uses this information to associate the data received from the host machine and guest machine. This may be performed in any suitable fashion, such as by comparing the information received from the host machine with the information received from the guest machine to determine that the host machine and guest machine reside on the same physical computer (e.g., computer 100, FIG. 1). Although the example process of FIG. 2 includes an information collection facility receiving data from only a single host machine and guest machine, information may be collected from any number of host and/or guest machines not represented in FIG. 2, as the invention is not limited in this respect.
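The comparison performed in act 255 may be sketched, under stated assumptions, as a grouping of received records by hash value. The record fields sender and hash, and the short placeholder hash values, are hypothetical and chosen only to keep the example readable.

```python
from collections import defaultdict


def associate_by_hash(records: list) -> dict:
    """Group senders whose telemetry carries the same hash value.

    Returns only those hashes reported by two or more senders, i.e. the
    cases in which an association can be made.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec["hash"]].append(rec["sender"])
    return {h: senders for h, senders in groups.items() if len(senders) > 1}


received = [
    {"sender": "host", "hash": "aaa"},     # placeholder hash values
    {"sender": "guest-1", "hash": "aaa"},
    {"sender": "other", "hash": "bbb"},
]
associated = associate_by_hash(received)
```

Here the facility would conclude that "host" and "guest-1" reside on the same physical computer, while "other" is left unassociated.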

Process 200 then completes.

It should be appreciated that not all embodiments of the invention include performing the acts described above in the specific sequence defined by example process 200. For example, some embodiments of the invention may include performing acts other than those described above, or may omit any one or more of the acts described above. For example, rather than the host machine providing information to the guest machine so that each may generate a one-way hash (or other derivation) of the information separately, the host machine could provide the hash to the guest machine. In addition, rather than the host machine reading the information from which a hash is generated (e.g., the FQDN) from the computer and providing it to one or more guest machines, each guest machine may instead read the information directly from the computer. Numerous variations of example process 200 may be employed, as embodiments of the invention may be implemented in any of numerous ways.

It should further be appreciated that although example process 200 comprises an information collection facility associating telemetry data provided by a single guest machine with that of a single host machine, embodiments of the invention may provide a capability to associate telemetry data received from any number of guest machines residing on a physical computer.

Telemetry data received by the information collection facility may be used in any of numerous ways. For example, telemetry data may be analyzed to better understand how server roles are implemented and used in virtualized environments; to better understand platform capabilities and hardware configurations of servers employing virtualization software; to better understand how virtualization software is used, configured, and performs; to identify trends in virtualization; and to compare information received from applications executing in virtualized and non-virtualized environments. Telemetry data received from guest and/or host machines in a virtualized environment may be analyzed in any of numerous ways.

In some embodiments, telemetry data received from each guest machine includes information descriptive of how the guest machine is implemented on the computer, which applications are installed and execute under the guest operating system (e.g., to implement different server roles), and how (e.g., the speed and efficiency with which) system operations are performed by the guest operating system and applications, so that this information may be compared with information received from other guest machines and a host machine. For example, information on how quickly certain system operations may be performed by the host machine and one or more guest machines may be compared to ascertain the efficiency with which a guest machine is able to access the computer's resources. As an example, if the information received from a certain guest machine indicates that a particular file copy operation performed by an application executing under its control takes a certain period of time, and information received from a host machine indicates that the same operation takes less time when performed by the host operating system, then this may indicate to developers of the virtualization software and/or the application that modifications may help the application access system resources more efficiently. It should be appreciated that the above is but one example of information that may be received and compared. Other examples of information that may be transmitted and analyzed by an information collection facility are described in above-referenced co-pending application Ser. No. 11/253,256.
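The file copy comparison described above may be illustrated with a short sketch. The ratio metric, the function name, and the sample timings are assumptions made for illustration; the description above does not prescribe any particular measure of efficiency.

```python
def guest_to_host_ratios(host_seconds: float, guest_seconds: dict) -> dict:
    """Express each guest machine's operation time relative to the host's.

    A ratio above 1.0 means the guest took longer than the host to perform
    the same operation (hypothetical metric, chosen for this sketch).
    """
    if host_seconds <= 0:
        raise ValueError("host_seconds must be positive")
    return {name: secs / host_seconds for name, secs in guest_seconds.items()}


# Hypothetical timings, in seconds, for the same file copy operation.
ratios = guest_to_host_ratios(2.0, {"guest-1": 3.0, "guest-2": 2.0})
```

With these sample values, guest-1 takes one and a half times as long as the host, which is the kind of result that might prompt a developer to examine how the application accesses system resources in a guest machine.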

As noted above, embodiments of the invention are not limited to associating telemetry data received from guest and/or host machines residing on a single computer, as some embodiments provide a capability to associate data received from guest and/or host machines residing on multiple computers (e.g., a group of servers operated by an organization). As a result, intelligence may be gleaned by comparing or correlating data received from various guest and/or host machines residing on a group of computers.

FIG. 3 depicts an example system in which computer 100 (also depicted in FIG. 1) resides in a group of computers which also includes computers 300 and 350. Computers 100, 300 and 350 execute operating systems 120, 320 and 355, respectively. Computer 100 includes three guest machines 105-1, 105-2 and 105-3, and computer 300 includes guest machines 305-1 and 305-2, which include operating systems 315-1 and 315-2, respectively. Computer 350 does not execute virtualization software, and so computer 350 does not include any guest machine.

Computers 100, 300 and 350 each communicate with information collection facility 150 via network(s) 135, which may include any suitable communications infrastructure and employ any suitable protocol(s).

In some embodiments, the host and guest machines on each of computers 100 and 300 may perform a process similar to example process 200 (FIG. 2), wherein information shared by the host and guest machine is included in telemetry data provided by each to information collection facility 150, except that in the system of FIG. 3, the information is common to host and guest machines residing on these computers, so that the information included in telemetry data provided by guest machine 105-1 is the same as, for example, that which is included in telemetry data provided by guest machine 305-1.

This may be accomplished in any of numerous ways, such as by using an identifier or portion of an identifier which is shared by all computers in the group. For example, in some embodiments in which each computer in a group has an FQDN and a portion of the FQDN is common to all members of the group, then the shared portion may be employed. For example, if the FQDN for computer 100 is host1.microsoft.com and the FQDN for computer 300 is host2.microsoft.com, then the portion microsoft.com shared by both computers may be employed. Of course, an FQDN portion need not be used, as any suitable information may be employed. For example, in embodiments wherein the group of computers is a collection of servers in a network domain which employ Microsoft Active Directory, the Active Directory GUID for the domain may be used by each server.
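The shared FQDN portion in the example above may be derived and hashed as in the following sketch. Both function names are hypothetical, and computing the shared portion from a list of FQDNs is a simplification made for illustration: in practice each machine might know only its own FQDN and be configured with the portion to use.

```python
import hashlib


def shared_fqdn_suffix(fqdns: list) -> str:
    """Return the trailing domain labels common to every FQDN in the list."""
    # Compare labels from right to left (e.g., "com", then "microsoft").
    reversed_labels = [f.lower().split(".")[::-1] for f in fqdns]
    common = []
    for labels in zip(*reversed_labels):
        if len(set(labels)) != 1:
            break
        common.append(labels[0])
    return ".".join(reversed(common))


def group_hash(fqdns: list) -> str:
    """Return a SHA-256 hex hash of the FQDN portion shared by the group."""
    suffix = shared_fqdn_suffix(fqdns)
    if not suffix:
        raise ValueError("the FQDNs share no common portion")
    return hashlib.sha256(suffix.encode("utf-8")).hexdigest()


suffix = shared_fqdn_suffix(["host1.microsoft.com", "host2.microsoft.com"])
group_id = group_hash(["host1.microsoft.com", "host2.microsoft.com"])
```

Every machine that hashes the same shared portion produces the same value, so the information collection facility may treat matching values as an indication of common group membership.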

As described above with reference to FIG. 2, a one-way hash and/or other derivation of this identifier may be generated (e.g., by each guest and/or host machine) and included in telemetry data transmitted to information collection facility 150. As a result, the information collection facility may associate data received from guest and/or host machines residing on each of the group of computers.

In addition, telemetry data provided by computer 350 to information collection facility 150 may include the information included in telemetry data by guest machines 105-1 and 305-1, so that the information collection facility may associate telemetry data received from each of these entities. For example, operating system 355 may include within telemetry data the same information as that which is included in telemetry data by guest machines 105-1 and 305-1. The information may, for example, be a one-way hash of information common to all members of the group of computers, such as a portion of the FQDN common to all members of the group, or the Active Directory GUID for the domain.

It should be appreciated that although only one computer that does not execute virtualization software (i.e., computer 350) is shown in FIG. 3, any number of computers not executing virtualization software may be included, and each may include within telemetry data provided to an information collection facility information that enables the information collection facility to associate the telemetry data provided by the computers.

Various aspects of the systems and methods for practicing features of the invention may be implemented on one or more computer systems, such as the exemplary computer system 400 shown in FIG. 4. Computer system 400 includes input device(s) 402, output device(s) 401, processor 403, memory system 404 and storage 406, all of which are coupled, directly or indirectly, via interconnection mechanism 405, which may comprise one or more buses, switches, networks and/or any other suitable interconnection. The input device(s) 402 receive(s) input from a user or machine (e.g., a human operator), and the output device(s) 401 display(s) or transmit(s) information to a user or machine (e.g., a liquid crystal display). The processor 403 typically executes a computer program called an operating system (e.g., a Microsoft Windows-family operating system, or any other suitable operating system) which controls the execution of other computer programs, and provides scheduling, input/output and other device control, accounting, compilation, storage assignment, data management, memory management, communication and dataflow control. Collectively, the processor and operating system define the computer platform for which application programs and other computer program languages are written.

The processor 403 may also execute one or more computer programs to implement various functions. These computer programs may be written in any type of computer program language, including a procedural programming language, object-oriented programming language, macro language, or combination thereof. These computer programs may be stored in storage system 406. Storage system 406 may hold information on a volatile or non-volatile medium, and may be fixed or removable. Storage system 406 is shown in greater detail in FIG. 5.

Storage system 406 typically includes a computer-readable and writable nonvolatile recording medium 501, on which signals are stored that define a computer program or information to be used by the program. A medium may, for example, be a disk or flash memory. Typically, in operation, the processor 403 causes data to be read from the nonvolatile recording medium 501 into a volatile memory 502 (e.g., a random access memory, or RAM) that allows for faster access to the information by the processor 403 than does the medium 501. The memory 502 may be located in the storage system 406, as shown in FIG. 5, or in memory system 404, as shown in FIG. 4. The processor 403 generally manipulates the data within the integrated circuit memory 404, 502 and then copies the data to the medium 501 after processing is completed. A variety of mechanisms are known for managing data movement between the medium 501 and the integrated circuit memory element 404, 502, and the invention is not limited thereto. The invention is also not limited to a particular memory system 404 or storage system 406.

Further, embodiments of the invention are also not limited to employing a cache manager component which is implemented as a driver in the I/O stack of an operating system. Any suitable component or combination of components, each of which may be implemented by an operating system or one or more standalone components, may alternatively or additionally be employed. The invention is not limited to any particular implementation.

The above-described embodiments of the present invention can be implemented in any of numerous ways. For example, the above-discussed functionality can be implemented using hardware, software or a combination thereof. When implemented in software, the software code can be executed on any suitable processor or collection of processors, whether provided in a single computer or distributed among multiple computers. In this respect, it should be appreciated that any component or collection of components that perform the functions described herein can be generically considered as one or more controllers that control the above-discussed functions. The one or more controllers can be implemented in numerous ways, such as with dedicated hardware, or by employing one or more processors that are programmed using microcode or software to perform the functions recited above. Where a controller stores or provides data for system operation, such data may be stored in a central repository, in a plurality of repositories, or a combination thereof.

Further, it should be appreciated that a (client or server) computer may be embodied in any of a number of forms, such as a rack-mounted computer, desktop computer, laptop computer, tablet computer, or other type of computer. Additionally, a (client or server) computer may be embedded in a device not generally regarded as a computer but with suitable processing capabilities, including a Personal Digital Assistant (PDA), smart phone or any other suitable portable or fixed electronic device.

Also, a (client or server) computer may have one or more input and output devices. These devices can be used, among other things, to present a user interface. Examples of output devices that can be used to provide a user interface include printers or display screens for visual presentation of output and speakers or other sound generating devices for audible presentation of output. Examples of input devices that can be used for a user interface include keyboards, and pointing devices, such as mice, touch pads, and digitizing tablets. As another example, a computer may receive input information through speech recognition or in other audible format.

Such computers may be interconnected by one or more networks in any suitable form, including a local area or a wide area network, such as an enterprise network and/or the Internet. Such networks may be based on any suitable technology and may operate according to any suitable protocol and may include wireless networks, wired networks or fiber optic networks. Also, the various methods or processes outlined herein may be coded as software that is executable on one or more processors that employ any one of a variety of operating systems or platforms.

Additionally, software may be written using any of a number of suitable programming languages and/or conventional programming or scripting tools, and also may be compiled as executable machine language code or intermediate code that is executed on a framework or virtual machine.

In this respect, the invention may be embodied as a computer-readable storage medium (or multiple storage media) (e.g., a computer memory, one or more floppy disks, compact disks, optical disks, magnetic tapes, flash memories, circuit configurations in Field Programmable Gate Arrays or other semiconductor devices, and/or other computer storage media) encoded with one or more programs which, when executed on one or more computers or other processors, perform methods that implement the various embodiments of the invention discussed above. The storage medium or media can be transportable, such that the program or programs stored thereon can be loaded onto one or more different computers or other processors to implement various aspects of the present invention as discussed above.

The terms “program” or “software” are used herein in a generic sense to refer to any type of computer code or set of computer-executable instructions that can be employed to program a computer or other processor to implement various aspects of the present invention as discussed above. Additionally, it should be appreciated that according to one aspect of this embodiment, one or more computer programs that when executed perform methods of the present invention need not reside on a single computer or processor, but may be distributed in a modular fashion amongst a number of different computers or processors to implement various aspects of the present invention.

Computer-executable instructions may be provided in many forms, such as program modules, executed by one or more computers or other devices. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Typically the functionality of the program modules may be combined or distributed as desired in various embodiments.

Various aspects of the present invention may be used alone, in combination, or in a variety of arrangements not specifically discussed in the embodiments described in the foregoing, and are therefore not limited in their application to the details and arrangement of components set forth in the foregoing description or illustrated in the drawings. For example, aspects described in one embodiment may be combined in any manner with aspects described in other embodiments.

Use of ordinal terms such as “first,” “second,” “third,” etc., in the claims to modify a claim element does not by itself connote any priority, precedence, or order of one claim element over another or the temporal order in which acts of a method are performed, but are used merely as labels to distinguish one claim element having a certain name from another element having a same name (but for use of the ordinal term) to distinguish the claim elements.

Also, the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting. The use of “including,” “comprising,” or “having,” “containing,” “involving,” and variations thereof herein, is meant to encompass the items listed thereafter and equivalents thereof as well as additional items.

Having thus described several aspects of at least one embodiment of this invention, it is to be appreciated that various alterations, modifications, and improvements will readily occur to those skilled in the art. Such alterations, modifications, and improvements are intended to be part of this disclosure, and are intended to be within the spirit and scope of the invention. Accordingly, the foregoing description and drawings are by way of example only.

1. At least one tangible computer-readable storage medium article having instructions recorded thereon which, when executed, perform a method comprising: (A) receiving, by each of a plurality of guest and/or host machines implemented by at least one computer executing virtualization software, information descriptive of the at least one computer, the information received by each of the plurality of guest and/or host machines being the same information; (B) each of the plurality of guest and/or host machines generating a one-way hash of the information; and (C) each of the plurality of guest and/or host machines including the one-way hash of the information within its respective telemetry data generated for transmission to an information collection facility, to enable the information collection facility to form an association between the telemetry data generated by each of the plurality of guest and/or host machines.
2. The at least one tangible computer-readable storage medium article of claim 1, wherein the act (A) comprises one or more of the plurality of guest and/or host machines reading the information from the at least one computer.
3. The at least one tangible computer-readable storage medium article of claim 1, wherein the at least one computer comprises a server computer.
4. The at least one tangible computer-readable storage medium article of claim 3, wherein each guest machine executes at least one application implementing a server role for the server computer.
5. The at least one tangible computer-readable storage medium article of claim 3, wherein the information comprises a fully qualified domain name (FQDN) of the server computer.
6. The at least one tangible computer-readable storage medium article of claim 1, further comprising acts of: (D) transmitting the telemetry data for each of the plurality of guest and/or host machines to the information collection facility; and (E) the information collection facility forming the association between the telemetry data generated by each of the plurality of guest and/or host machines.
7. The at least one tangible computer-readable storage medium article of claim 1, wherein the act (B) comprises employing a SHA-256 one-way hashing algorithm.
8. A system comprising: at least one processor programmed to implement an information collection facility configured to: receive, from at least one computer coupled to the information collection facility via at least one network, each of the at least one computers executing virtualization software implementing a plurality of guest and/or host machines configured to generate respective telemetry data, information usable to form an association between the telemetry data generated by each of the plurality of guest and/or host machines; and form an association between the telemetry data received from each of the plurality of guest and/or host machines.
9. The system of claim 8, further comprising the at least one computer.
10. The system of claim 8, wherein the information usable to form an association between the telemetry data generated by each of the plurality of guest and/or host machines comprises a one-way hash of information included within the telemetry data by each of the plurality of guest and/or host machines, and the information collection facility is configured to form the association between the telemetry data received from each of the plurality of guest and/or host machines using the one-way hash.
11. The system of claim 10, wherein the at least one computer is a plurality of computers, each of the plurality of computers has an identifier of which at least a portion is common to each of the plurality of computers, and the information comprises the at least a portion of the identifier that is common to each of the plurality of computers.
12. The system of claim 11, wherein the at least a portion of the identifier comprises a portion of an FQDN of each of the plurality of computers that is common to each of the plurality of computers.
13. The system of claim 11, wherein the plurality of computers each belong to a same network domain, and the at least a portion of the identifier is a globally unique identifier (GUID) for the domain.
14. The system of claim 8, wherein the at least one computer is a single computer, and wherein the information collection facility is configured to form an association between telemetry data received from guest and/or host machines implemented on the single computer.
15. The system of claim 8, wherein the at least one computer is a plurality of computers, and wherein the information collection facility is configured to form an association between telemetry data received from guest and/or host machines implemented on each of the plurality of computers.
16. A method for use in a system comprising a plurality of computers each coupled to an information collection facility via at least one network, the method comprising: (A) each of the plurality of computers generating respective telemetry data for transmission to the information collection facility, the telemetry data generated by each of the plurality of computers comprising a same item of information which cannot be used to identify any of the plurality of computers but which can be used to form an association between the telemetry data generated by each of the plurality of computers; and (B) transmitting the telemetry data generated by the plurality of computers to the information collection facility via the at least one network.
17. The method of claim 16, wherein at least one of the plurality of computers executes virtualization software.
18. The method of claim 16, further comprising an act of: (C) the information collection facility receiving the telemetry data generated by the plurality of computers; and (D) the information collection facility using the item of information to form an association between the telemetry data generated by the plurality of computers.
19. The method of claim 16, wherein at least one of the plurality of computers is a server computer.
20. The method of claim 16, wherein the item of information generated by each one of the plurality of computers comprises a one-way hash of information descriptive of the one computer.
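
By way of illustration only, the acts recited in the claims above (hashing shared information descriptive of a computer, including the hash in telemetry data, and forming an association at the information collection facility) might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function names, the record fields `group_id` and `payload`, and the lower-casing and whitespace normalization of the input are all hypothetical choices made for the example and do not appear in the foregoing description. SHA-256 is used because it is named in claim 7.

```python
import hashlib
from collections import defaultdict


def anonymized_group_id(info: str) -> str:
    """Return a SHA-256 one-way hash of information descriptive of a computer.

    Normalizing case and whitespace is an assumption of this sketch, made so
    that every machine reading the same information yields the same hash.
    """
    normalized = info.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def common_fqdn_suffix(fqdns):
    """Return the trailing DNS labels shared by every FQDN in the group."""
    split = [f.strip().lower().rstrip(".").split(".") for f in fqdns]
    common = []
    # Compare labels from the right-hand (most general) end of each name.
    for labels in zip(*(reversed(s) for s in split)):
        if len(set(labels)) != 1:
            break
        common.append(labels[0])
    return ".".join(reversed(common))


def build_telemetry(info: str, payload: dict) -> dict:
    """Attach the hash (not the raw information) to a telemetry record."""
    return {"group_id": anonymized_group_id(info), "payload": payload}


def associate(records):
    """Collection-facility side: group telemetry records by their hash."""
    groups = defaultdict(list)
    for record in records:
        groups[record["group_id"]].append(record)
    return dict(groups)


# Hypothetical example: two machines on computers in one domain, one elsewhere.
shared = common_fqdn_suffix(["web1.corp.example.com", "db2.corp.example.com"])
records = [
    build_telemetry(shared, {"role": "web"}),
    build_telemetry(shared, {"role": "database"}),
    build_telemetry("other.example.org", {"role": "mail"}),
]
groups = associate(records)
```

In this sketch the two records built from the shared FQDN portion fall into one group, while the third record forms a group of its own. The collection facility sees only the hash values and never the underlying names.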